Abstract

In this paper, we study small perturbations of a class of non-convex integrable Hamiltonians with two degrees of freedom, and we prove a result of diffusion for an open and dense set of perturbations, with an optimal time of diffusion which grows linearly with respect to the inverse of the size of the perturbation.



1 Introduction and statement of the result

1.1 Introduction 

In this paper, we consider small perturbations of integrable Hamiltonian systems which are defined by a Hamiltonian function of the form

\[ H(\theta, I) = h(I) + \varepsilon f(\theta, I), \quad (\theta, I) \in \mathbb{T}^n \times \mathbb{R}^n, \quad 0 < \varepsilon \ll 1, \]

where $n \geq 2$ is an integer and $\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$. When $\varepsilon = 0$, $H = h$ is integrable in the sense that the action variables $I(t)$ of all solutions $(\theta(t), I(t))$ of the system associated to $h$ are first integrals, $I(t) = I(0)$ for all times $t \in \mathbb{R}$. The sets $I = I_0$, for $I_0 \in \mathbb{R}^n$, are thus invariant tori of dimension $n$ in the phase space $\mathbb{T}^n \times \mathbb{R}^n$, which moreover carry quasi-periodic motions with frequency $\omega(I_0) = \nabla h(I_0)$, that is $\theta(t) = \theta(0) + t\omega(I_0)$ modulo $\mathbb{Z}^n$. From now on we will assume that the small parameter $\varepsilon$ is non-zero, in which case the system defined by $H$ can be considered as an $\varepsilon$-perturbation of the integrable system defined by $h$.

In the sixties, Arnold conjectured that for a generic $h$, the following phenomenon should occur: "for any points $I'$ and $I''$ on the connected level hypersurface of $h$ in the action space there exist orbits connecting an arbitrary small neighbourhood of the torus $I = I'$ with an arbitrary small neighbourhood of the torus $I = I''$, provided that $\varepsilon$ is sufficiently small and that $f$ is generic" (see [Arn94]). This is a strong form of instability. A weaker form of this conjecture would be to ask for the existence of orbits for which the variation of the actions is of order one, that is bounded from below independently of $\varepsilon$ for all $\varepsilon$ sufficiently small. To support his conjecture, Arnold gave an example in [Arn64] where this weaker form of instability is satisfied, with $n = 2$, $h$ convex and $f$ a specific time-periodic perturbation (so this is equivalent to $n = 3$, $h$ quasi-convex and $f$ a specific time-independent perturbation). The phenomenon highlighted in [Arn64] is now known as Arnold diffusion.
Obstructions to Arnold diffusion, and to any form of instability in general, are widely known following the works of Kolmogorov and Arnold on the one hand, and the work of Nekhoroshev on the other hand. In [Kol54], Kolmogorov proved that for a non-degenerate $h$ and for all $f$, the system defined by $H$ still has many invariant tori, provided it is analytic and $\varepsilon$ is small enough. What he showed is that among the set of unperturbed invariant tori, there is a subset of positive measure (the complement of which has a measure going to zero when $\varepsilon$ goes to zero) which survives any sufficiently small perturbation, the tori being only slightly deformed. The non-degeneracy assumption on $h$ is that at all points, the determinant of its Hessian matrix $\nabla^2 h(I)$ is non-zero. Then, under a different non-degeneracy assumption on $h$, namely that the determinant of the square matrix

\[ \begin{pmatrix} \nabla^2 h(I) & {}^t\nabla h(I) \\ \nabla h(I) & 0 \end{pmatrix} \]

is non-zero at all points, Arnold proved in [Arn63a], [Arn63b] a similar statement but with a set of tori inside a fixed level hypersurface. In particular, for $n = 2$, a level hypersurface is 3-dimensional and the complement of the set of invariant 2-dimensional tori is disconnected, and each connected component is bounded with a diameter going to zero as $\varepsilon$ goes to zero. As a consequence, it can be proved more precisely that for $n = 2$ and if $h$ is non-degenerate in the sense of Arnold, along all solutions we have

\[ |I(t) - I(0)| \leq c\sqrt{\varepsilon}, \quad t \in \mathbb{R}, \]

for some positive constant $c$. Therefore we have stability for all solutions and for all time. Now for any $n \geq 2$, and if $h$ is either Kolmogorov or Arnold non-degenerate, we have perpetual stability only for most solutions, those lying on invariant tori, and Arnold's example shows that this cannot be true for all solutions. The consequence of these results is that Arnold diffusion cannot exist for $n = 2$ if $h$ is Arnold non-degenerate, and for $n \geq 2$ and $h$ Kolmogorov or Arnold non-degenerate, the unstable solution, if it exists, must live in a set of relatively small measure. In another direction, in the seventies Nekhoroshev proved ([Nek77], [Nek79]) that for any $n \geq 2$, for a non-degenerate $h$ and for all $f$, along all solutions we have

\[ |I(t) - I(0)| \leq c_1 \varepsilon^{b}, \quad |t| \leq \exp\left(c_2 \varepsilon^{-a}\right), \]

for some positive constants $c_1$, $c_2$, $a$ and $b$, provided $\varepsilon$ is small enough and the system analytic. So solutions which do not lie on invariant tori are stable not for all time, but during an interval of time which is exponentially long with respect to some power of the inverse of $\varepsilon$. The consequence on Arnold diffusion is that the time of diffusion, that is the time it takes for the action variables to drift independently of $\varepsilon$, is exponentially large. The integrable systems non-degenerate in the sense of Nekhoroshev, which are called steep, were originally quite complicated to define, but an equivalent definition was found in [Ily86] and [Nie06]: $h$ is steep if and only if its restriction to any affine subspace has only isolated critical points. Such functions can be proved to be generic in a rather strong sense ([Nek73]), and the simplest (and also steepest) functions are the convex or quasi-convex ones (convex or quasi-convex functions are those for which the stability exponent $a$ in Nekhoroshev estimates is the best). Note that convex (respectively quasi-convex) functions are Kolmogorov (respectively Arnold) non-degenerate.

So the results of Kolmogorov, Arnold and Nekhoroshev restrict the possibility of diffusion, both in space and in time, at least provided the corresponding non-degeneracy assumptions are met. Following the original insight of Arnold in [Arn64], much study has been devoted to perturbations of a special class of Hamiltonian systems, which are called "a priori" unstable, where these restrictions are much less stringent. We won't try to give a precise definition of "a priori" unstable systems, but these systems are integrable in the larger sense of symplectic geometry (they have $n$ first integrals in involution and independent almost everywhere) but display hyperbolic features (typically they have a normally hyperbolic invariant manifold), and by opposition, the systems we are considering are called "a priori" stable. These simpler "a priori" unstable systems are now well-understood, and many results confirm that instability occurs for a generic perturbation, see for instance [Tre04], [CY04], [DdlLS06], [GR07], [Ber08], [DH09], [CY09] and [GR09].

The situation for "a priori" stable systems is much more complicated. In [Mat04] (see also [Mat12] for a recent corrected version), Mather announced a proof of Arnold conjecture in a special case, that is a strong form of Arnold diffusion for a generic time-dependent perturbation of a convex integrable Hamiltonian with $n = 2$ (and also for a generic time-independent perturbation of a quasi-convex integrable Hamiltonian with $n = 3$) based on his variational techniques. Mather never gave a complete proof of the announced results, but his work and unpublished preprints played a fundamental role in the subsequent developments. First, Bernard, Kaloshin and Zhang in [BKZ11] proved a weaker form of Arnold conjecture, still with the convexity requirement but for an arbitrary number of degrees of freedom. Then, Kaloshin and Zhang in [KZ12] proved the strong form of Arnold conjecture, for $n = 2$, $h$ convex and $f$ time-periodic. Similar results were independently obtained by Cheng ([Che12]) and announced by Marco ([Mar12a], [Mar12b]). The central and common point in all these works, which was not present in the work of Mather, is the use of normally hyperbolic invariant manifolds as a "skeleton" for the unstable orbits. On the other hand, most of these works do rely strongly on Mather's variational techniques, once the normally hyperbolic invariant manifolds have been constructed. It has to be noted that these variational techniques, and to a lesser extent, the existence of normally hyperbolic cylinders, use in an essential way the convexity assumption, so that none of these works apply to non-convex integrable Hamiltonians. It can be said that a typical non-degenerate integrable system (in the sense of Kolmogorov, Arnold or Nekhoroshev) is neither convex nor quasi-convex, but for these systems, essentially nothing is known: for the simplest integrable Hamiltonians $h$ which are neither convex nor quasi-convex but steep and non-degenerate (in the sense of Kolmogorov for $n \geq 2$ or Arnold for $n \geq 3$), it is not even known how to construct a single $f$ such that $H = h + \varepsilon f$ has unstable orbits. This is a bit paradoxical from the point of view of Nekhoroshev estimates, as the time of diffusion for perturbations of steep non-convex integrable Hamiltonians should be smaller and hence diffusion should be easier to observe.

Yet for some non-convex non-steep integrable Hamiltonians, the construction of examples is much easier and has been known for a long time. A prototype of such an integrable Hamiltonian with two degrees of freedom, which can be found in [Nek77] (but a completely analogous example, in a slightly different setting, was already considered in [Mos60]), is given by $h(I_1, I_2) = \frac{1}{2}(I_1^2 - I_2^2)$: letting $f(\theta_1, \theta_2) = (2\pi)^{-1}\sin(2\pi(\theta_1 - \theta_2))$, the system $H = h + \varepsilon f$ admits the unstable solution $I(t) = (-\varepsilon t, \varepsilon t)$, $\theta(t) = -\frac{1}{2}(\varepsilon t^2, \varepsilon t^2)$. This Hamiltonian $h$ is obviously non-convex, but it is also non-steep since the restriction of $h$ to the lines $\{I_1 \pm I_2 = 0\}$ is constant, so this restriction has only critical points, which are thus non-isolated. Also it is degenerate in the sense of Arnold, so diffusion can and does already occur for $n = 2$, even though it is non-degenerate in the sense of Kolmogorov so that it admits many invariant tori (circles). Moreover, the time of diffusion in this example is the smallest possible, as it is linear with respect to the inverse of $\varepsilon$.
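As a sanity check of the explicit solution in this example, the following Python sketch integrates Hamilton's equations for $H = \frac{1}{2}(I_1^2 - I_2^2) + \varepsilon(2\pi)^{-1}\sin(2\pi(\theta_1 - \theta_2))$ numerically and compares the result at time $t = \varepsilon^{-1}$ with $I(t) = (-\varepsilon t, \varepsilon t)$, $\theta(t) = -\frac{1}{2}(\varepsilon t^2, \varepsilon t^2)$. The Runge-Kutta integrator, the step count and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def hamiltonian_rhs(state, eps):
    """Hamilton's equations for H = (I1^2 - I2^2)/2 + eps*sin(2*pi*(th1 - th2))/(2*pi)."""
    th1, th2, i1, i2 = state
    c = math.cos(2.0 * math.pi * (th1 - th2))
    # theta' = dH/dI, I' = -dH/dtheta
    return (i1, -i2, -eps * c, eps * c)

def rk4_step(state, eps, dt):
    """One classical fourth-order Runge-Kutta step (an illustrative choice of integrator)."""
    def shifted(k, factor):
        return tuple(s + factor * v for s, v in zip(state, k))
    k1 = hamiltonian_rhs(state, eps)
    k2 = hamiltonian_rhs(shifted(k1, dt / 2.0), eps)
    k3 = hamiltonian_rhs(shifted(k2, dt / 2.0), eps)
    k4 = hamiltonian_rhs(shifted(k3, dt), eps)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(eps, t_final, steps):
    """Integrate from theta(0) = (0, 0), I(0) = (0, 0) up to time t_final."""
    state = (0.0, 0.0, 0.0, 0.0)
    dt = t_final / steps
    for _ in range(steps):
        state = rk4_step(state, eps, dt)
    return state

eps = 1e-3
t_final = 1.0 / eps  # diffusion time of order 1/eps
th1, th2, i1, i2 = integrate(eps, t_final, 20000)
# Deviation from the explicit solution I(t) = (-eps*t, eps*t), theta_j(t) = -eps*t^2/2
action_error = max(abs(i1 + eps * t_final), abs(i2 - eps * t_final))
angle_error = max(abs(th1 + 0.5 * eps * t_final ** 2), abs(th2 + 0.5 * eps * t_final ** 2))
```

Here the actions drift by an amount of order one over a time $\varepsilon^{-1}$, which is the linear time scale discussed above.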

The example above is rather specific, as $f$ does not depend on the action variables, but more importantly, it depends only on a specific combination of the angular variables. The purpose of this paper is to investigate the question whether such a phenomenon remains true for a generic perturbation. We will show in Theorem 1.1 (§1.2) that we have diffusion for a class of non-convex non-steep Hamiltonians $h$ with two degrees of freedom, which includes the example $h(I_1, I_2) = \frac{1}{2}(I_1^2 - I_2^2)$ as a particular case, and for an open and dense set of perturbations, with a time of diffusion which is linear with respect to the inverse of $\varepsilon$. The conditions defining this class of integrable Hamiltonians $h$ are the existence of a line with rational slope such that $h$ is constant along this line (which means that the gradient of the restriction of $h$ to this line vanishes), but its gradient $\nabla h$ is not identically zero along this line (these are the assumptions (A.1) and (A.2) in §1.2). For integrable Hamiltonians which are compatible with "fast" diffusion (that is, with a time of diffusion which is linear with respect to the inverse of $\varepsilon$) for some perturbation, we expect these conditions to be quite sharp. The set of admissible perturbations is very easily described: we only require that some "averaged" perturbation is a non-constant function. Moreover, we only require $h$ to be $C^4$ and $f$ to be $C^3$. As for the proof, it is in fact rather simple. The only ingredient is a normal form, in the spirit of [BKZ11], which is valid on a domain in the action space whose size is independent of $\varepsilon$, even though, unlike [BKZ11] where the normal form is used for another purpose (namely to construct a normally hyperbolic invariant cylinder which is then used to locate an unstable orbit), we need a slightly stronger statement in order to derive the existence of an unstable orbit directly from the normal form.

To conclude, let us note that the statement of Theorem 1.1 gives a diffusion in a weak sense, that is the action variables drift independently of $\varepsilon$ for all $\varepsilon$ sufficiently small, but we cannot find an orbit which connects arbitrary neighbourhoods in the space of action. Also, for the moment, it is restricted to two degrees of freedom, which is the minimal number of degrees of freedom for which instability can occur for Arnold degenerate integrable systems. The normal form we used is in fact valid for any number of degrees of freedom, but in general it appears too weak to derive the result directly from it, and therefore we expect that additional restrictions on the set of admissible perturbations have to be imposed for more degrees of freedom. We plan to come back to these issues in a subsequent work.

1.2 Main result 

Let us now state precisely the main result of the paper. Given $R > 0$, let $B_R$ be the closed ball of $\mathbb{R}^2$ of radius $R$ with respect to the supremum norm $|\cdot|$, that is $B_R = \{(I_1, I_2) \in \mathbb{R}^2 \mid |I_1| \leq R,\ |I_2| \leq R\}$. Our integrable Hamiltonian $h$ will be a function $h : B_R \to \mathbb{R}$ of class $C^4$, which satisfies the following two conditions:

(A.1) There exist a vector $k = (k_1, k_2) \in \mathbb{Z}^2 \setminus \{0\}$ and a constant $a \in \mathbb{R}$ such that if $L = \{(I_1, I_2) \in \mathbb{R}^2 \mid k_1 I_1 + k_2 I_2 + a = 0\}$, then the restriction of $h$ to the line $L$ is constant.

(A.2) There exists a point $I^* \in L \cap B_{R^*}$, for some $0 < R^* < R$, such that $\nabla h(I^*) \neq 0$.

Note that the condition (A.1) obviously rules out convex functions, but it also rules out steep functions. Indeed, (A.1) is equivalent to the assertion that the gradient of $h|_L$ vanishes identically on $L \cap B_R$, hence the function $h|_L$ has a set of critical points which contains $L \cap B_R$ and hence is non-isolated. As for the condition (A.2), it is a non-degeneracy assumption, as we want to avoid that the gradient of $h$ vanishes identically in the interior of $L \cap B_R$. The condition (A.1) is crucial, whereas (A.2) is somehow just technical, as we believe it can be removed in general. Following the terminology of [Bou12b], functions which do satisfy (A.1) are functions which are not rationally steep.

Given a small parameter $0 < \varepsilon < 1$, our perturbation $\varepsilon f$ will be a "generic" function $\varepsilon f : \mathbb{T}^2 \times B_R \to \mathbb{R}$ which is "small" for the $C^3$ topology. For an integer $r \geq 2$, let $C^r(\mathbb{T}^2 \times B_R)$ be the space of $C^r$ functions $f : \mathbb{T}^2 \times B_R \to \mathbb{R}$, which is a Banach space with respect to the norm

\[ |f|_{C^r(\mathbb{T}^2 \times B_R)} = \sup_{j \in \mathbb{N}^4,\ |j| \leq r} \left( \sup_{(\theta, I) \in \mathbb{T}^2 \times B_R} |\partial^j f(\theta, I)| \right), \]

where we have used the standard multi-index notation. We extend the definition of the $C^r$-norm for vector-valued functions $F = (f_1, \dots, f_m) : \mathbb{T}^2 \times B_R \to \mathbb{R}^m$, for an arbitrary integer $m \geq 1$, by setting

\[ |F|_{C^r(\mathbb{T}^2 \times B_R, \mathbb{R}^m)} = \sup_{1 \leq i \leq m} |f_i|_{C^r(\mathbb{T}^2 \times B_R)}. \]

Let us denote by $C^r_1(\mathbb{T}^2 \times B_R)$ the unit ball of $C^r(\mathbb{T}^2 \times B_R)$ with respect to this norm, that is

\[ C^r_1(\mathbb{T}^2 \times B_R) = \{ f \in C^r(\mathbb{T}^2 \times B_R) \mid |f|_{C^r(\mathbb{T}^2 \times B_R)} \leq 1 \}. \]

Our perturbation $\varepsilon f$ will be such that $f$ belongs to an open and dense subset $\mathcal{F}^*_k$ of $C^3_1(\mathbb{T}^2 \times B_R)$, depending on the vector $k$ defined in (A.1) and the point $I^*$ defined in (A.2). For a given function $f \in C^3(\mathbb{T}^2 \times B_R)$, we define $f^*_k \in C^3(\mathbb{T}^2)$ by

\[ f^*_k(\theta) = \int_0^1 f(\theta + tk, I^*)\,dt, \]

then $\mathcal{F}^*_k$ is defined by

\[ \mathcal{F}^*_k = \{ f \in C^3_1(\mathbb{T}^2 \times B_R) \mid \exists\, \theta^* \in \mathbb{T}^2,\ \partial_\theta f^*_k(\theta^*) \neq 0 \}. \]

In words, $\mathcal{F}^*_k$ is the subset of functions $f \in C^3_1(\mathbb{T}^2 \times B_R)$ such that $f^*_k$ is a non-constant function: this is obviously an open and dense subset of $C^3_1(\mathbb{T}^2 \times B_R)$. Note that $f^*_k$ is a function on $\mathbb{T}^2$, but by definition it is constant on the orbits of the linear flow of frequency $k$, hence it can be considered as being defined on the space of orbits (the leaf space) of this flow, which is diffeomorphic to $\mathbb{T}$.
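To illustrate the averaged perturbation $f^*_k$ and the non-constancy condition, here is a minimal Python sketch. It uses the perturbation $f(\theta_1, \theta_2) = (2\pi)^{-1}\sin(2\pi(\theta_1 - \theta_2))$ from the introduction, which does not depend on the actions, so the point $I^*$ plays no role. The quadrature rule, the grid-based test and the names `averaged`, `sample_f` and `is_nonconstant` are assumptions made for illustration only.

```python
import math

def averaged(f, k, theta, nodes=64):
    """Approximate f_k^*(theta) = int_0^1 f(theta + t*k) dt with a uniform rectangle rule.

    The integrand is 1-periodic in t when k is an integer vector, so this rule
    is very accurate for smooth f.
    """
    k1, k2 = k
    total = 0.0
    for j in range(nodes):
        t = j / nodes
        total += f(theta[0] + t * k1, theta[1] + t * k2)
    return total / nodes

def sample_f(th1, th2):
    """Perturbation of the introductory example (independent of the actions)."""
    return math.sin(2.0 * math.pi * (th1 - th2)) / (2.0 * math.pi)

def is_nonconstant(f, k, grid=16, tol=1e-9):
    """Heuristic check on a grid that the averaged function is not constant."""
    values = [averaged(f, k, (a / grid, b / grid))
              for a in range(grid) for b in range(grid)]
    return max(values) - min(values) > tol

# k = (1, 1): the line I1 + I2 = 0; the average reproduces f itself, which is non-constant.
nonconstant_along_11 = is_nonconstant(sample_f, (1, 1))
# k = (1, -1): the line I1 - I2 = 0; the average vanishes identically.
nonconstant_along_1m1 = is_nonconstant(sample_f, (1, -1))
```

With this particular $f$, the condition defining $\mathcal{F}^*_k$ holds for $k = (1, 1)$ and fails for $k = (1, -1)$, which is consistent with the explicit drifting solution of the introduction moving along the line $I_1 + I_2 = 0$.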

We can finally state our main result. 

Theorem 1.1. Let $H = h + \varepsilon f$ be defined on $\mathbb{T}^2 \times B_R$, with $h \in C^4_1(B_R)$ satisfying (A.1) and (A.2) and $f \in \mathcal{F}^*_k$. Then there exist a positive constant $C$, depending only on $h$, and positive constants $\varepsilon_0$ and $\delta$ depending also on $f$, such that for any $0 < \varepsilon \leq \varepsilon_0$, the Hamiltonian system defined by $H$ has a solution $(\theta(t), I(t))$ such that

\[ |I(\tau) - I(0)| \geq C\delta^2, \quad \tau = \delta\varepsilon^{-1}. \]

It is a statement of diffusion for the action variables, in the sense that they have a variation which is bounded from below independently of $\varepsilon$, for all $\varepsilon$ small enough. It has to be noted that the time of diffusion $\tau = \delta\varepsilon^{-1}$ is essentially optimal in the sense that for all $f \in C^2(\mathbb{T}^2 \times B_R) \cap C^1_1(\mathbb{T}^2 \times B_R)$, for all $\varepsilon > 0$ and for all $0 < \delta < 1$, we have

\[ |I(t) - I(0)| \leq \delta, \quad |t| \leq \tau = \delta\varepsilon^{-1}, \]

for all solutions of $H = h + \varepsilon f$. In particular, for the solution given by Theorem 1.1 one has the inequalities

\[ C\delta^2 \leq |I(\tau) - I(0)| \leq \delta. \]

Concerning the dependence of the constants involved, the dependence on $h$ is only through $R$, the vector $k$ and the constant $a$ that appeared in (A.1), the norm of $\nabla h(I^*)$ and $R^*$ that appeared in (A.2), while the dependence on $f$ is only through the absolute value of $\partial_\theta f^*_k(\theta^*)$. We refer to Theorem 2.1 in §2.1 for a more concrete and precise statement.

Let us now discuss some particular cases of functions $h$ satisfying (A.1) and (A.2), and therefore for which one has diffusion for a generic perturbation. As we will explain later, we can always assume without loss of generality that $a = 0$ in (A.1), and upon adding an irrelevant additive constant, we can assume that the restriction of $h$ to $L$ is identically zero.

For a linear Hamiltonian $h(I) = \omega \cdot I$, it follows that (A.1) and (A.2) are satisfied if and only if $\omega$ is resonant, that is $l \cdot \omega = 0$ for some $l \in \mathbb{Z}^2 \setminus \{0\}$, and $\omega$ is non-zero. On the other hand, if $\omega$ is non-resonant, it follows from [Bou12b] that the statement of Theorem 1.1 cannot be true for any sufficiently small perturbation, since for all sufficiently small perturbations, one has stability for an interval of time which is strictly larger than $[-\tau, \tau]$ with $\tau$ as above. In particular, if $\omega$ is Diophantine, one has stability for an interval of time which is exponentially long with respect to $\varepsilon^{-1}$, up to an exponent depending only on the Diophantine exponent of $\omega$.

Now for a quadratic Hamiltonian $h(I) = AI \cdot I$ where $A$ is a 2 by 2 symmetric matrix, (A.1) and (A.2) are satisfied if and only if there exists a vector $l \in \mathbb{Z}^2 \setminus \{0\}$ such that $Al \cdot l = 0$ and $Al \neq 0$. Assuming that $A$ is diagonal, its eigenvalues have to be of different sign, and writing $h(I) = \alpha_1^2 I_1^2 - \alpha_2^2 I_2^2$, (A.1) and (A.2) are satisfied if and only if $\alpha_1 \neq 0$, $\alpha_2 \neq 0$ and $\alpha_2/\alpha_1 \in \mathbb{Q}$. The example described in the introduction corresponds to $\alpha_1 = \alpha_2 = 1$. On the other hand, one knows that if $\alpha_2/\alpha_1$ is irrational, the statement of Theorem 2.1 cannot be true for any sufficiently small perturbation for the same reason as above: for instance, if $\alpha_2/\alpha_1$ is a Diophantine number, the quadratic Hamiltonian falls into the class of Diophantine steep functions introduced in [Nie07] and it follows from results in [Nie07] or [BN12] that such Hamiltonians are stable for an exponentially long interval of time.

Note that in these two special cases, the condition (A.2), which amounts to $\omega \neq 0$ in the first case and $Al \neq 0$ in the second case, can be easily removed.

We already explained that the time of diffusion $\tau$ is in some sense optimal, regardless of the integrable Hamiltonian $h$. Now we believe that if we fix the time of diffusion, the condition (A.1) on the integrable Hamiltonian $h$ is also in some sense optimal, as if $h$ does not satisfy this assumption, one can have diffusion but with a time strictly greater than $\tau$. This is indeed the case for linear or quadratic integrable Hamiltonians as we described above, and the general case is conjectured in [Bou12b].

2 Proof of Theorem 1.1

In §2.1 we will perform some preliminary transformations to reduce Theorem 1.1 to an equivalent but more concrete statement, which is Theorem 2.1. Theorem 2.1 will be proved in §2.3 based on a normal form result which is stated and proved in §2.2.



2.1 Preliminary reductions 

Let us first give more concrete formulations of the conditions (A.1) and (A.2).

First we may assume that the line $L$ in (A.1) passes through the origin, that is $L = \{(I_1, I_2) \in \mathbb{R}^2 \mid k_1 I_1 + k_2 I_2 = 0\}$: indeed, we can always find a translation of the action variables $T : \mathbb{R}^2 \to \mathbb{R}^2$ such that $T$ sends $\{(I_1, I_2) \in \mathbb{R}^2 \mid k_1 I_1 + k_2 I_2 = 0\}$ to $\{(I_1, I_2) \in \mathbb{R}^2 \mid k_1 I_1 + k_2 I_2 + a = 0\}$, and since the map $\Phi_T(\theta, I) = (\theta, TI)$ is symplectic, the statement holds true for $H$ if and only if it holds true for $H \circ \Phi_T$, up to constants depending on $a$.

Then we can suppose that the components of the vector $k = (k_1, k_2) \in \mathbb{Z}^2 \setminus \{0\}$ are relatively prime, since changing $k$ by $k/p$, where $p$ is the greatest common divisor of $k_1$ and $k_2$, does not change the definition of $L$. Hence we may assume that in fact $k = e_2 = (0, 1)$, that is $L = \{(I_1, I_2) \in \mathbb{R}^2 \mid I_2 = 0\}$: indeed we can always find a matrix $M \in GL_2(\mathbb{Z})$ such that its second column is $k$, hence $Me_2 = k$ and ${}^tM^{-1}$ sends $\{(I_1, I_2) \in \mathbb{R}^2 \mid I_2 = 0\}$ to $\{(I_1, I_2) \in \mathbb{R}^2 \mid k_1 I_1 + k_2 I_2 = 0\}$. The map $\Phi_M(\theta, I) = (M\theta, {}^tM^{-1}I)$ is well-defined since $M\mathbb{T}^2 = \mathbb{T}^2$, and it is symplectic, so the statement holds true for $H$ if and only if it holds true for $H \circ \Phi_M$, up to constants depending on $k$.

Note that the symplectic transformations $\Phi_T$ and $\Phi_M$ do change the domain $B_R$ in the space of actions, but to simplify the notations, we will assume that the latter is fixed.
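The existence of the matrix $M$ used in this reduction can be made concrete. The sketch below, under the assumption that $k$ has relatively prime components, builds an integer matrix $M$ of determinant $1$ with $Me_2 = k$ from the extended Euclidean algorithm, and checks that ${}^tM^{-1}$ maps the direction of the line $\{I_2 = 0\}$ into the line $\{k_1 I_1 + k_2 I_2 = 0\}$. The function names are hypothetical and the construction is only one possible choice of $M$.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g and g = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def unimodular_with_second_column(k):
    """Integer matrix M (as rows) with det M = 1 and M e_2 = k, for coprime k."""
    k1, k2 = k
    g, x, y = ext_gcd(k1, k2)
    if g != 1:
        raise ValueError("components of k must be relatively prime")
    # det [[y, k1], [-x, k2]] = y*k2 + x*k1 = 1
    return ((y, k1), (-x, k2))

def transpose_inverse(m):
    """Transpose of the inverse of a 2x2 integer matrix of determinant 1."""
    (a, b), (c, d) = m
    # inverse = [[d, -b], [-c, a]], so its transpose is [[d, -c], [-b, a]]
    return ((d, -c), (-b, a))

def apply(m, v):
    """Matrix-vector product for 2x2 matrices stored as rows."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

k = (3, 5)
M = unimodular_with_second_column(k)
image_of_e1 = apply(transpose_inverse(M), (1, 0))  # image of the direction of {I_2 = 0}
```

For $k = (3, 5)$ the image of $e_1$ is orthogonal to $k$, so the line $\{I_2 = 0\}$ is indeed sent to $\{k_1 I_1 + k_2 I_2 = 0\}$.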

Now for all $I = (I_1, I_2) \in B_R$, let us write

\[ \nabla h(I) = \omega(I) = (\omega_1(I), \omega_2(I)) = (\omega_1(I_1, I_2), \omega_2(I_1, I_2)) \in \mathbb{R}^2. \]

The condition (A.1) is that $h$ is constant on $L = \{(I_1, I_2) \in \mathbb{R}^2 \mid I_2 = 0\}$, which is obviously equivalent to $\partial_{I_1} h(I_1, 0) = \omega_1(I_1, 0) = 0$ for all $I_1$ such that $|I_1| \leq R$. The condition (A.2) is that there exist $I_1^*$ and $0 < R^* < R$ with $|I_1^*| \leq R^*$ such that $\omega_2(I^*) = \omega_2(I_1^*, 0) = \omega^* \neq 0$. Changing $H$ to $-H$ if necessary and reversing the time accordingly, we may assume that $\omega^* > 0$.

We can eventually formulate simplified conditions, that we call (B.1) and (B.2):

(B.1) For all $I_1$ such that $|I_1| \leq R$, we have $\omega_1(I_1, 0) = 0$.

(B.2) There exist $I_1^*$ and $0 < R^* < R$ with $|I_1^*| \leq R^*$ such that $\omega_2(I^*) = \omega_2(I_1^*, 0) = \omega^* > 0$.

Then the definition of $\mathcal{F}^*_{e_2}$ also simplifies: one easily checks that for $f \in C^3_1(\mathbb{T}^2 \times B_R)$, we have $f^*_{e_2} \in C^3_1(\mathbb{T})$ where

\[ f^*_{e_2}(\theta_1) = \int_0^1 f(\theta_1, \theta_2, I^*)\,d\theta_2, \]

so that $f \in \mathcal{F}^*_{e_2}$ if and only if there exists $\theta_1^* \in \mathbb{T}$ for which $\partial_{\theta_1} f^*_{e_2}(\theta_1^*) \neq 0$. For simplicity, we write $f^* = f^*_{e_2}$ and $\mathcal{F}^* = \mathcal{F}^*_{e_2}$, and for $f \in \mathcal{F}^*$, we denote by $\lambda$ a lower bound on the absolute value of $\partial_{\theta_1} f^*(\theta_1^*)$.

From the previous discussion, it follows that Theorem 1.1 is implied by the following statement.

Theorem 2.1. Let $H = h + \varepsilon f$ be defined on $\mathbb{T}^2 \times B_R$, with $h \in C^4_1(B_R)$ satisfying (B.1) and (B.2) and $f \in \mathcal{F}^*$. Then there exist a positive constant $C$, depending only on $R$, $R^*$ and $\omega^*$, and a positive constant $\varepsilon_0$ depending also on $\lambda$, such that for any $0 < \varepsilon \leq \varepsilon_0$, if we set $\delta = \lambda(4C)^{-1}$, the Hamiltonian system defined by $H$ has a solution $(\theta(t), I(t))$ such that

\[ |I_1(\tau) - I_1(0)| \geq C\delta^2, \quad \tau = \delta\varepsilon^{-1}. \]



2.2 A normal form 

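The main ingredient of the proof of Theorem 2.1 will be a normal form on a domain in the space of action whose size, in the direction given by the first action variable $I_1$, is independent of $\varepsilon$. Let us define

\[ \rho = \min\{(R - R^*)/2,\ \omega^*/4\} \]

and the domain $D^*_\rho$ by

\[ D^*_\rho = \{(I_1, I_2) \in \mathbb{R}^2 \mid |I_1 - I_1^*| \leq \rho,\ I_2 = 0\}. \]

Let us furthermore define $\kappa = 4/\omega^*$ and consider the $\kappa\varepsilon$-neighbourhood $D^*_\rho(\kappa\varepsilon)$ of the domain $D^*_\rho$, defined by

\[ D^*_\rho(\kappa\varepsilon) = \{I \in \mathbb{R}^2 \mid d(I, D^*_\rho) \leq \kappa\varepsilon\} = \{(I_1, I_2) \in \mathbb{R}^2 \mid |I_1 - I_1^*| \leq \rho + \kappa\varepsilon,\ |I_2| \leq \kappa\varepsilon\}, \]

where the distance is the one induced by the supremum norm on $\mathbb{R}^2$. Eventually, we define $\mathcal{D}^*_\rho(\kappa\varepsilon) = \mathbb{T}^2 \times D^*_\rho(\kappa\varepsilon)$.

In the statement and proof of the proposition below, to avoid cumbersome notations, when convenient we will use a dot $\cdot$ in replacement of any constant depending only on $R$, $R^*$ and $\omega^*$, that is for any two quantities $u$ and $v$, an expression $u \lessdot v$ means that there exists a constant $c$ depending only on $R$, $R^*$ and $\omega^*$ such that $u \leq cv$.

Proposition 2.2. Let $H = h + \varepsilon f$ be defined on $\mathbb{T}^2 \times B_R$, with $h \in C^4_1(B_R)$ satisfying (B.1) and (B.2) and $f \in \mathcal{F}^*$. Assume that $\kappa\varepsilon \leq \rho$. Then there exists a symplectic embedding $\Phi : \mathcal{D}^*_\rho(\kappa\varepsilon/2) \to \mathcal{D}^*_\rho(\kappa\varepsilon)$ of class $C^2$ such that

\[ H \circ \Phi = h + \varepsilon\bar{f} + \varepsilon f', \quad \bar{f}(\theta_1, I) = \int_0^1 f(\theta_1, \theta_2, I)\,d\theta_2, \]

and, if $\Phi = (\Phi_\theta, \Phi_I)$, we have the following estimates

\[ |\Phi_I - \mathrm{Id}|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq \kappa\varepsilon/2, \quad |\partial_\theta f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot \varepsilon, \quad |\partial_I f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot 1. \]

The proof of this proposition uses some elementary estimates which are recalled in Appendix A.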

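Proof. First of all, note that since $\kappa\varepsilon \leq \rho$, by definition of $\rho$ the domain $D^*_\rho(\kappa\varepsilon)$ is included in $B_R$. For a function $\chi : \mathcal{D}^*_\rho(\kappa\varepsilon) \to \mathbb{R}$ of class $C^3$ to be chosen below, the transformation $\Phi$ in the statement will be obtained as the time-one map of the Hamiltonian flow generated by $\varepsilon\chi$. Let $X_{\varepsilon\chi}$ be the Hamiltonian vector field generated by $\varepsilon\chi$, and $X^t_{\varepsilon\chi}$ the time-$t$ map. Assuming that $X^t_{\varepsilon\chi}$ is well-defined on $\mathcal{D}^*_\rho(\kappa\varepsilon/2)$ for $|t| \leq 1$, let $\Phi = X^1_{\varepsilon\chi}$. Using the relation

\[ \frac{d}{dt}\left(K \circ X^t_{\varepsilon\chi}\right) = \varepsilon\{K, \chi\} \circ X^t_{\varepsilon\chi} \]

for an arbitrary function $K$, and writing

\[ H \circ \Phi = h \circ X^1_{\varepsilon\chi} + \varepsilon f \circ X^1_{\varepsilon\chi}, \]

we can apply Taylor's formula to the right-hand side of the above equality, at order two for the first term and at order one for the second term, to get

\[ \begin{aligned} H \circ \Phi &= h + \varepsilon\{h, \chi\} + \varepsilon^2 \int_0^1 (1 - t)\{\{h, \chi\}, \chi\} \circ X^t_{\varepsilon\chi}\,dt + \varepsilon f + \varepsilon^2 \int_0^1 \{f, \chi\} \circ X^t_{\varepsilon\chi}\,dt \\ &= h + \varepsilon(\{h, \chi\} + f) + \varepsilon^2 \int_0^1 \{(1 - t)\{h, \chi\} + f, \chi\} \circ X^t_{\varepsilon\chi}\,dt \\ &= h + \varepsilon\bar{f} + \varepsilon(\{h, \chi\} + f - \bar{f}) + \varepsilon^2 \int_0^1 \{(1 - t)\{h, \chi\} + f, \chi\} \circ X^t_{\varepsilon\chi}\,dt \end{aligned} \tag{1} \]

where $\bar{f}$ is the function defined in the statement. It would be natural to choose $\chi$ to solve the equation $\{h, \chi\} + g = 0$ where $g = f - \bar{f}$, which can be written as $\{\chi, h\} = g$, but to avoid the use of Fourier expansions and hence regularity issues, we will only solve this equation approximately.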

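Let us define the projection $\Pi : D^*_\rho(\kappa\varepsilon) \to D^*_\rho(\kappa\varepsilon) \cap \{I_2 = 0\} = D^*_{\rho + \kappa\varepsilon}$, and for $I \in D^*_\rho(\kappa\varepsilon)$, we write $\Pi(I) = \bar{I}$. For any $I = (I_1, I_2) \in D^*_\rho(\kappa\varepsilon)$, we obviously have

\[ |I - \bar{I}| \leq \kappa\varepsilon \tag{2} \]

and $\omega(\bar{I}) = (\omega_1(I_1, 0), \omega_2(I_1, 0)) = (0, \omega_2(I_1, 0))$ by (B.1). Moreover, for any $I = (I_1, I_2) \in D^*_\rho(\kappa\varepsilon)$, $|I_1 - I_1^*| \leq \rho + \kappa\varepsilon \leq 2\rho$ and as $|h|_{C^4(B_R)} \leq 1$, we have

\[ |\omega_2(I_1, 0) - \omega_2(I_1^*, 0)| \leq 2\rho \leq \omega^*/2 \]

by definition of $\rho$. Since $\omega_2(I_1^*, 0) = \omega^* > 0$ by (B.2), it follows that $\omega_2(I_1, 0) \geq \omega^*/2$ for any $I = (I_1, I_2) \in D^*_\rho(\kappa\varepsilon)$. Now observe that the equation $\{\chi, h\} = g$ can be written again as

\[ \omega(I) \cdot \partial_\theta\chi(\theta, I) = g(\theta, I), \quad (\theta, I) \in \mathcal{D}^*_\rho(\kappa\varepsilon). \tag{3} \]

Instead of solving this equation, we will solve the equation

\[ \omega(\bar{I}) \cdot \partial_\theta\chi(\theta, I) = g(\theta, I), \quad (\theta, I) \in \mathcal{D}^*_\rho(\kappa\varepsilon), \tag{4} \]

which can be written again as

\[ \omega_2(I_1, 0)\,\partial_{\theta_2}\chi(\theta, I) = g(\theta, I), \quad (\theta, I) \in \mathcal{D}^*_\rho(\kappa\varepsilon), \tag{5} \]

since $\omega(\bar{I}) = (0, \omega_2(I_1, 0))$. We claim that the equation (5) is solved by

\[ \chi(\theta, I) = \frac{1}{\omega_2(I_1, 0)} \int_0^1 g(\theta + te_2, I)\,t\,dt, \tag{6} \]

where $e_2 = (0, 1)$. First recall that $g = f - \bar{f}$ and therefore

\[ \int_0^1 g(\theta + te_2, I)\,dt = \int_0^1 f(\theta + te_2, I)\,dt - \int_0^1 \bar{f}(\theta_1, I)\,dt = \bar{f}(\theta_1, I) - \bar{f}(\theta_1, I) = 0. \]

Then we compute

\[ \begin{aligned} \omega_2(I_1, 0)\,\partial_{\theta_2}\chi(\theta, I) &= \partial_{\theta_2}\left(\int_0^1 g(\theta + te_2, I)\,t\,dt\right) = \int_0^1 \partial_{\theta_2} g(\theta + te_2, I)\,t\,dt \\ &= \int_0^1 \left(\frac{d}{dt} g(\theta + te_2, I)\right) t\,dt = g(\theta + te_2, I)\,t\,\Big|_0^1 - \int_0^1 g(\theta + te_2, I)\,dt \\ &= g(\theta + e_2, I) = g(\theta, I), \end{aligned} \]

where we have used the chain rule and an integration by parts in the second line.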
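The claim that formula (6) solves equation (5) can also be checked numerically. The following Python sketch does so for a hypothetical test function $g$ with zero average in $\theta_2$ and a fixed positive value standing in for $\omega_2(I_1, 0)$; the action variable is frozen, and the Simpson quadrature, the finite-difference step and the function names are illustrative assumptions rather than part of the proof.

```python
import math

def g(th1, th2):
    """Sample 1-periodic function with zero average in th2 (chosen for illustration)."""
    return (math.cos(2.0 * math.pi * th1) * math.sin(2.0 * math.pi * th2)
            + 0.3 * math.sin(4.0 * math.pi * th2))

def chi(th1, th2, omega2, n=200):
    """chi = (1/omega2) * int_0^1 g(th1, th2 + t) * t dt, by the composite Simpson rule."""
    h = 1.0 / n  # n must be even
    total = 0.0
    for j in range(n + 1):
        t = j * h
        if j == 0 or j == n:
            weight = 1.0
        elif j % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * g(th1, th2 + t) * t
    return total * h / 3.0 / omega2

def residual(th1, th2, omega2, step=1e-4):
    """omega2 * d(chi)/d(th2) - g, with a central finite difference for the derivative."""
    d_chi = (chi(th1, th2 + step, omega2) - chi(th1, th2 - step, omega2)) / (2.0 * step)
    return omega2 * d_chi - g(th1, th2)

omega2 = 1.5  # stands in for omega_2(I_1, 0) >= omega^*/2 > 0
worst = max(abs(residual(a / 7.0, b / 7.0, omega2)) for a in range(7) for b in range(7))
```

The residual is small up to discretization error, in agreement with the integration by parts above; the zero-average property of $g$ is what makes the boundary term reproduce $g(\theta, I)$.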

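Now $h \in C^4_1(B_R)$, $f \in C^3_1(\mathbb{T}^2 \times B_R)$ and $\omega_2(I_1, 0) \geq \omega^*/2$ for any $I = (I_1, I_2) \in D^*_\rho(\kappa\varepsilon)$: it follows from (6) and Leibniz formula (inequality (18) of Appendix A) that $\chi$ is of class $C^3$ and

\[ |\chi|_{C^3(\mathcal{D}^*_\rho(\kappa\varepsilon))} \lessdot 1 \tag{7} \]

whereas

\[ |\partial_\theta\chi|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon), \mathbb{R}^2)} \leq 2/\omega^* = \kappa/2 \tag{8} \]

for any $j \in \mathbb{N}^2$, $|j| \leq 2$. So in particular, from (7) and (8), the function $\varepsilon\chi \in C^3(\mathbb{T}^2 \times B_R)$ satisfies

\[ |\varepsilon\chi|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))} \lessdot \varepsilon, \quad |\partial_\theta \varepsilon\chi|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon), \mathbb{R}^2)} \leq \kappa\varepsilon/2, \]

hence we can apply Lemma A.1 of Appendix A: for all $|t| \leq 1$, $X^t_{\varepsilon\chi} : \mathcal{D}^*_\rho(\kappa\varepsilon/2) \to \mathcal{D}^*_\rho(\kappa\varepsilon)$ is a well-defined symplectic embedding of class $C^2$, and if we write $X^1_{\varepsilon\chi} = (\Phi_\theta, \Phi_I)$, we have the estimates

\[ |\Phi_I - \mathrm{Id}|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq \kappa\varepsilon/2, \quad |X^t_{\varepsilon\chi} - \mathrm{Id}|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^4)} \lessdot \varepsilon, \quad |X^t_{\varepsilon\chi}|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^4)} \lessdot 1. \tag{9} \]

In particular, the first estimate gives

\[ |\Phi_I - \mathrm{Id}|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq \kappa\varepsilon/2. \]

Now let us define $R_1$ by $R_1(\theta, I) = (\omega(\bar{I}) - \omega(I)) \cdot \partial_\theta\chi(\theta, I)$ for $(\theta, I) \in \mathcal{D}^*_\rho(\kappa\varepsilon)$ and

\[ f' = R_1 + R_2, \quad R_2 = \varepsilon \int_0^1 \{(1 - t)\{h, \chi\} + f, \chi\} \circ X^t_{\varepsilon\chi}\,dt. \]

It follows from the equalities (1), (3) and (4) that

\[ H \circ \Phi = h + \varepsilon\bar{f} + \varepsilon f', \]

so it remains only to estimate the partial derivatives of $f'$. The estimates

\[ |\partial_\theta R_1|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot \varepsilon, \quad |\partial_I R_1|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot 1 \]

follow easily from (2), (7) and (8). Then we have

\[ \begin{aligned} |R_2|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon/2))} &\lessdot \varepsilon\,|\{(1 - t)\{h, \chi\} + f, \chi\}|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon))}\,|X^t_{\varepsilon\chi}|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^4)} \\ &\lessdot \varepsilon\,|\{\{h, \chi\} + f, \chi\}|_{C^1(\mathcal{D}^*_\rho(\kappa\varepsilon))} \\ &\lessdot \varepsilon\,|\{h, \chi\} + f|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))}\,|\chi|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))} \\ &\lessdot \varepsilon\left(|\{h, \chi\}|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))} + |f|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))}\right) \\ &\lessdot \varepsilon\left(|h|_{C^3(\mathcal{D}^*_\rho(\kappa\varepsilon))}\,|\chi|_{C^3(\mathcal{D}^*_\rho(\kappa\varepsilon))} + |f|_{C^2(\mathcal{D}^*_\rho(\kappa\varepsilon))}\right) \\ &\lessdot \varepsilon \end{aligned} \]

where we have used the last part of (9), the fact that $h \in C^4_1(B_R)$ and $f \in C^3_1(\mathbb{T}^2 \times B_R)$, the estimate (7) and the inequality (19) of Appendix A several times. This implies that

\[ |\partial_\theta R_2|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot \varepsilon, \quad |\partial_I R_2|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot \varepsilon \]

and since $f' = R_1 + R_2$, we eventually obtain

\[ |\partial_\theta f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot \varepsilon, \quad |\partial_I f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \lessdot 1 \]

which concludes the proof. □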
2.3 Proof of Theorem 2.1

The proof of Theorem 2.1 is now a consequence of our normal form Proposition 2.2. Since the latter is defined on a domain which is independent of $\varepsilon$ in the $I_1$-direction, it will be possible to prove the statement of Theorem 2.1 for the normal form $H \circ \Phi$ by analyzing directly the equations of motion, and using the fact that $\Phi$ is close to the identity, we will prove that the statement remains true for $H$.

Proof of Theorem 2.1. Recall that we are considering $H = h + \varepsilon f$ defined on $\mathbb{T}^2 \times B_R$, with $h \in C^4_1(B_R)$ satisfying (B.1) and (B.2) and $f \in \mathcal{F}^*$, so we can apply Proposition 2.2: assuming $\kappa\varepsilon \leq \rho$, there exist positive constants $C_1$ and $C_2$ depending only on $R$, $R^*$ and $\omega^*$ and a symplectic embedding $\Phi : \mathcal{D}^*_\rho(\kappa\varepsilon/2) \to \mathcal{D}^*_\rho(\kappa\varepsilon)$ of class $C^2$ such that

\[ H \circ \Phi = h + \varepsilon\bar{f} + \varepsilon f', \quad \bar{f}(\theta_1, I) = \int_0^1 f(\theta_1, \theta_2, I)\,d\theta_2. \]

Moreover, if $\Phi = (\Phi_\theta, \Phi_I)$, we have the following estimates

\[ |\Phi_I - \mathrm{Id}|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq \kappa\varepsilon/2, \tag{10} \]

and

\[ |\partial_\theta f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq C_1\varepsilon, \quad |\partial_I f'|_{C^0(\mathcal{D}^*_\rho(\kappa\varepsilon/2), \mathbb{R}^2)} \leq C_2 \tag{11} \]

where $\rho$, $\kappa$, $D^*_\rho(\kappa\varepsilon) \subset B_R$ and $\mathcal{D}^*_\rho(\kappa\varepsilon) \subset \mathbb{T}^2 \times B_R$ have been defined in §2.2.

Let us consider the Hamiltonian H = H $ defined on T>*(ke/2), and we shall write 
$(0,1) = (0,1). Let I* = (11,0) be given by (B.2), and 0J such that 

\d e ~ x nei)\ = \d e ~ i m,n\>^ w 

Note that necessarily A < 1 since / G Ci(T 2 x -Br). We consider a solution (8(t),I(t)) of the 
system defined by i? with an initial condition (0(0), 1(0)) such that 1(0) = I*, 0i(O) = 0* and 
02 (0) G T arbitrary: we have the equations 

ih(t) = -Ed § j(h(t)J(t)) - edgj'{e(t)J{t)), 

p 2 (t) = -Ed e ~ 2 r(e(t)J(t)), (l3) 

ie 1 (t) = a Jl (i(t)) + Ed f J(e 1 (t),i(t)) + Ed fi f'(e(t),i(t)), 

{ie 2 (t)=u; 2 (I(t))+ed I J0 l (t)J(t)) + ed f2 f0(t),I(t)), 

since / is independent of the second angular variables. For a positive constant 5 to be chosen 
later in terms of A, we let r = 5e~ 1 . From the second equation of ()13j) and the first estimate 
of (fTT|) . we get 

\I 2 (t)-i 2 (0)\ = \I 2 (t)\<C l E5, \t\<r, 

which makes sense provided that \I 2 (t) — I 2 (0)\ < ke/2 for |t| < r, and this is satisfied if 
Ci<5 < k/2, that is 5 < k(2Ci)~ 1 . Now for \t\ < r, recalling that ui(h(t) , I 2 (Q)) = and 
h G Cf(Bfi), we have 

|wi(/(t))| = |wi(Ji(t),/ 2 (t))| = \ui{h{t)J 2 (t))-ui{h(t)J 2 (0))\ < \I 2 (t) - 1 2 (0)\ < deS. 



11 



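Therefore, from the third equation of (13), the second estimate of (11) and the fact that
the norm of $f$ on $\mathbb{T}^2 \times B_R$ is at most one, we have

\[ \left| \frac{d}{dt} \tilde\theta_1(t) \right| < C_1\varepsilon\delta + \varepsilon + C_2\varepsilon = (C_1\delta + 1 + C_2)\varepsilon < C\varepsilon, \qquad |t| < \tau, \]

with $C = C_1 + 1 + C_2$, provided $\delta < 1$. This implies that

\[ |\tilde\theta_1(t) - \theta_1^*| = |\tilde\theta_1(t) - \tilde\theta_1(0)| < C\delta, \qquad |t| < \tau. \tag{14} \]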



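Moreover, recall that $|\tilde I_2(t) - \tilde I_2(0)| < C_1\varepsilon\delta < C\delta$ for $|t| < \tau$ by the definition of $C$ and
since $\varepsilon < 1$, and from the first equation of (13), the first estimate of (11) and the fact that
the norm of $f$ on $\mathbb{T}^2 \times B_R$ is at most one, we also have

\[ |\tilde I_1(t) - I_1^*| = |\tilde I_1(t) - \tilde I_1(0)| < \delta + C_1\varepsilon\delta, \qquad |t| < \tau, \]

which makes sense if $\delta < \kappa(2C_1)^{-1}$ as this implies that $|\tilde I_1(t) - I_1^*| < \delta + \kappa\varepsilon/2$. In particular

\[ |\tilde I_1(t) - I_1^*| = |\tilde I_1(t) - \tilde I_1(0)| < C\delta, \qquad |t| < \tau, \]

by the definition of $C$ and since $\varepsilon < 1$, and therefore

\[ |\tilde I(t) - I^*| = |\tilde I(t) - \tilde I(0)| < C\delta, \qquad |t| < \tau. \tag{15} \]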

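Using the fact that the norm of $f$ on $\mathbb{T}^2 \times B_R$ is at most one, from (14) and (15) we obtain

\[ |\partial_{\tilde\theta_1} \bar f(\tilde\theta_1(t), \tilde I(t)) - \partial_{\tilde\theta_1} \bar f(\theta_1^*, I^*)| < C\delta, \qquad |t| < \tau. \tag{16} \]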

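We eventually choose $\delta = \lambda(4C)^{-1}$, and hence $\tau = \lambda(4C)^{-1}\varepsilon^{-1}$. We have to make sure that
$\delta < 1$ and $\delta < \kappa(2C_1)^{-1}$. The first requirement is obviously satisfied since $C > 1$ and $\lambda < 1$.
For the second, note that $\omega^* < 1$ since the norm of $h$ on $B_R$ is at most one, so $\kappa > 1$, hence $\lambda < 1 < 2\kappa$ and this
implies that $\delta < \kappa(2C_1)^{-1}$ as $C > C_1$. Now from (12), (16) and the definition of $\delta$, we have
for all $|t| < \tau$,

\[ |\partial_{\tilde\theta_1} \bar f(\tilde\theta_1(t), \tilde I(t))| \geq |\partial_{\tilde\theta_1} \bar f(\theta_1^*, I^*)| - |\partial_{\tilde\theta_1} \bar f(\tilde\theta_1(t), \tilde I(t)) - \partial_{\tilde\theta_1} \bar f(\theta_1^*, I^*)| > \lambda - C\delta = 3\lambda/4. \]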

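Moreover, if we assume that $\varepsilon < (4C_1)^{-1}\lambda$, then from the first estimate of (11), we have

\[ |\partial_{\tilde\theta_1} f'(\tilde\theta(t), \tilde I(t))| < C_1\varepsilon < \lambda/4, \qquad |t| < \tau, \]

and this gives, as before,

\[ |\partial_{\tilde\theta_1} \bar f(\tilde\theta_1(t), \tilde I(t)) + \partial_{\tilde\theta_1} f'(\tilde\theta(t), \tilde I(t))| > 3\lambda/4 - \lambda/4 = \lambda/2, \qquad |t| < \tau. \]

Now from the first equation of (13), we obtain

\[ \left| \frac{d}{dt} \tilde I_1(t) \right| > \varepsilon\lambda/2, \qquad |t| < \tau, \]

which eventually gives

\[ |\tilde I_1(\tau) - \tilde I_1(0)| > \tau\varepsilon\lambda/2 = \lambda^2(8C)^{-1}. \]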
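
The passage from the pointwise lower bound on the derivative to the displacement estimate relies on a sign argument that is left implicit. A minimal sketch, assuming only that $t \mapsto \frac{d}{dt}\tilde I_1(t)$ is continuous:

```latex
% d/dt I~_1 is continuous and satisfies |d/dt I~_1(t)| > eps*lambda/2 for |t| < tau,
% so it cannot vanish and, by the intermediate value theorem, keeps a constant sign there.
\begin{align*}
  |\tilde I_1(\tau) - \tilde I_1(0)|
    = \Big| \int_0^\tau \frac{d}{dt}\tilde I_1(t)\, dt \Big|
    = \int_0^\tau \Big| \frac{d}{dt}\tilde I_1(t) \Big|\, dt
    > \tau\, \frac{\varepsilon\lambda}{2}
    = \frac{\lambda}{4C\varepsilon} \cdot \frac{\varepsilon\lambda}{2}
    = \frac{\lambda^2}{8C} .
\end{align*}
```

Without the constant sign, the lower bound on the speed alone would not prevent the first action from oscillating.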
Coming back to the original Hamiltonian, $\Phi(\tilde\theta(t), \tilde I(t)) = (\theta(t), I(t))$ is a solution of the
Hamiltonian $H$, and from the estimate (10), we have

\[ |\tilde I_1(t) - I_1(t)| < \kappa\varepsilon/2 \]

as long as $\tilde I(t) \in D^*(\kappa\varepsilon/2)$, so in particular

\[ |\tilde I_1(0) - I_1(0)| < \kappa\varepsilon/2, \qquad |\tilde I_1(\tau) - I_1(\tau)| < \kappa\varepsilon/2. \]

Assuming that $\varepsilon < \lambda^2(16C\kappa)^{-1}$, this gives

\[ |I_1(\tau) - I_1(0)| \geq |\tilde I_1(\tau) - \tilde I_1(0)| - |\tilde I_1(\tau) - I_1(\tau)| - |\tilde I_1(0) - I_1(0)| > \lambda^2(8C)^{-1} - \kappa\varepsilon > \lambda^2(16C)^{-1}. \]

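Summing up, if we define

\[ \varepsilon_0 = \min\{\rho\kappa^{-1},\ \lambda(4C_1)^{-1},\ \lambda^2(16C\kappa)^{-1}\} \]

then for $\varepsilon < \varepsilon_0$, if $\delta = \lambda(4C)^{-1}$ and $\tau = \delta\varepsilon^{-1}$, the Hamiltonian $H$ has a solution $(\theta(t), I(t))$
for which

\[ |I_1(\tau) - I_1(0)| > \lambda^2(16C)^{-1} = C\delta^2. \]

This was the statement to prove. $\square$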
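
For the reader's convenience, the role of each term in the threshold can be summarized as follows. This is only a bookkeeping sketch of the inequalities used in the proof; the parenthetical comments are informal glosses and not part of the statement.

```latex
% Bookkeeping for the three terms of epsilon_0 and for the choice of delta.
\begin{align*}
  \varepsilon < \rho\kappa^{-1}
    &\;\Longrightarrow\; \kappa\varepsilon < \rho
    && \text{(the normal form applies)}, \\
  \varepsilon < \lambda (4C_1)^{-1}
    &\;\Longrightarrow\; C_1 \varepsilon < \lambda/4
    && \text{(the term } f' \text{ cannot cancel } \bar f\text{)}, \\
  \varepsilon < \lambda^2 (16 C \kappa)^{-1}
    &\;\Longrightarrow\; \kappa\varepsilon < \lambda^2 (16C)^{-1}
    && \text{(changing coordinates costs at most half the drift)}, \\
  \delta = \lambda (4C)^{-1}
    &\;\Longrightarrow\; C\delta^2 = \frac{C \lambda^2}{16 C^2} = \frac{\lambda^2}{16 C} .
\end{align*}
```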

A Technical estimates 

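Let $D$ be a bounded domain in $\mathbb{R}^2$ of diameter $2\rho$, and for $0 < \varepsilon < 1$ and a positive constant
$\kappa$, consider the domains $D(\kappa\varepsilon) = \{I \in \mathbb{R}^2 \mid d(I, D) < \kappa\varepsilon\}$ and $\mathcal{D}(\kappa\varepsilon) = \mathbb{T}^2 \times D(\kappa\varepsilon)$.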

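Let us begin by recalling some elementary estimates. First if $f \in C^r(\mathcal{D}(\kappa\varepsilon))$ for $r > 2$,
then for $j \in \mathbb{N}^4$, $|j| < r$, $\partial^j f \in C^{r-|j|}(\mathcal{D}(\kappa\varepsilon))$ and obviously

\[ |\partial^j f|_{C^{r-|j|}(\mathcal{D}(\kappa\varepsilon))} \leq |f|_{C^r(\mathcal{D}(\kappa\varepsilon))}. \tag{17} \]

In particular, this implies that if $f \in C^r(\mathcal{D}(\kappa\varepsilon))$, then its Hamiltonian vector field $X_f$ is of
class $C^{r-1}$ and

\[ |X_f|_{C^{r-1}(\mathcal{D}(\kappa\varepsilon), \mathbb{R}^4)} \leq |f|_{C^r(\mathcal{D}(\kappa\varepsilon))}. \]

Then, given two functions $f, g \in C^r(\mathcal{D}(\kappa\varepsilon))$, the product $fg$ belongs to $C^r(\mathcal{D}(\kappa\varepsilon))$ and by the
Leibniz formula

\[ |fg|_{C^r(\mathcal{D}(\kappa\varepsilon))} < c(r)\, |f|_{C^r(\mathcal{D}(\kappa\varepsilon))}\, |g|_{C^r(\mathcal{D}(\kappa\varepsilon))} \tag{18} \]

for some constant $c(r)$ depending only on $r$. By (17) and (18), the Poisson bracket $\{f, g\}$ belongs
to $C^{r-1}(\mathcal{D}(\kappa\varepsilon))$ and

\[ |\{f, g\}|_{C^{r-1}(\mathcal{D}(\kappa\varepsilon))} < c(r)\, |f|_{C^r(\mathcal{D}(\kappa\varepsilon))}\, |g|_{C^r(\mathcal{D}(\kappa\varepsilon))} \tag{19} \]

for another constant $c(r)$ depending only on $r$.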
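
The way (19) follows from (17) and (18) can be sketched as below. The sign convention for the Poisson bracket is an assumption made here (the opposite convention leads to the same bound), (18) is assumed to be applied at order $r-1$ in place of $r$, and all norms are taken over $\mathcal{D}(\kappa\varepsilon)$.

```latex
% Assumed sign convention (the text does not spell it out):
%   {f,g} = sum_{i=1,2} ( d_{theta_i} f * d_{I_i} g  -  d_{I_i} f * d_{theta_i} g ).
% c(r-1) below is the constant of (18) at order r-1; norms are over D(kappa*epsilon).
\begin{align*}
  |\{f,g\}|_{C^{r-1}}
    &\le \sum_{i=1}^{2} \Big( |\partial_{\theta_i} f\, \partial_{I_i} g|_{C^{r-1}}
        + |\partial_{I_i} f\, \partial_{\theta_i} g|_{C^{r-1}} \Big) \\
    &\le c(r-1) \sum_{i=1}^{2} \Big( |\partial_{\theta_i} f|_{C^{r-1}}\, |\partial_{I_i} g|_{C^{r-1}}
        + |\partial_{I_i} f|_{C^{r-1}}\, |\partial_{\theta_i} g|_{C^{r-1}} \Big)
        && \text{by (18)} \\
    &\le 4\, c(r-1)\, |f|_{C^r}\, |g|_{C^r}
        && \text{by (17)}.
\end{align*}
```

Under these assumptions, the constant in (19) can be taken equal to four times the constant of (18) at order $r-1$.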

We shall also need the following lemma, which follows easily from Faà di Bruno's formula
(see for instance [AR67]) and classical results on the existence and regularity of solutions of
differential equations.
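Lemma A.1. Let $f \in C^3(\mathcal{D}(\kappa\varepsilon))$, and assume that

\[ |f|_{C^2(\mathcal{D}(\kappa\varepsilon))} < C\varepsilon, \qquad |\partial_\theta f|_{C^0(\mathcal{D}(\kappa\varepsilon), \mathbb{R}^2)} \leq \kappa\varepsilon/2, \]

for some positive constant $C$. Then, for all $|t| < 1$, $X_f^t : \mathcal{D}(\kappa\varepsilon/2) \to \mathcal{D}(\kappa\varepsilon)$ is a well-defined
symplectic embedding of class $C^2$, and if we write $X_f^t = (\Phi_\theta^t, \Phi_I^t)$, we have the estimates

\[ |\Phi_I^t - \mathrm{Id}|_{C^0(\mathcal{D}(\kappa\varepsilon/2), \mathbb{R}^2)} < \kappa\varepsilon/2, \qquad |X_f^t - \mathrm{Id}|_{C^1(\mathcal{D}(\kappa\varepsilon/2), \mathbb{R}^4)} < C_1\varepsilon, \qquad |X_f^t|_{C^1(\mathcal{D}(\kappa\varepsilon/2), \mathbb{R}^4)} < C_2 \]

for some constants $C_1$, $C_2$ depending only on $C$, $\rho$ and $\kappa$.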

Note that the constants $C_1$ and $C_2$ depend only on $C$ and on the diameter of $D(\kappa\varepsilon)$, but
since $\varepsilon < 1$, the latter is bounded by $2(\rho + \kappa)$ where $2\rho$ is the diameter of $D$.

The proof of the above lemma is a simple adaptation of Lemma 3.15 in [DH09], see also
Lemma A.1 in [Bou12a].
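
One ingredient of Lemma A.1, the estimate on the action component of the flow, can be sketched as follows. This is an illustration only, not the argument of the cited references; it assumes the sign convention $\dot I = -\partial_\theta f$ used in (13), and the point $x = (\theta, I)$ is notation introduced here.

```latex
% Illustrative sketch; x = (theta, I) is a point of D(kappa*epsilon/2), i.e. d(I, D) < kappa*epsilon/2.
% Sign convention assumed as in (13): dI/dt = - d_theta f.
\begin{align*}
  \Phi_I^t(x) - I &= - \int_0^t \partial_\theta f\big(X_f^s(x)\big)\, ds, \\
  |\Phi_I^t(x) - I| &\le |t|\, |\partial_\theta f|_{C^0(\mathcal{D}(\kappa\varepsilon), \mathbb{R}^2)}
     \le |t|\, \frac{\kappa\varepsilon}{2} < \frac{\kappa\varepsilon}{2},
     \qquad |t| < 1, \\
  d\big(\Phi_I^t(x), D\big) &\le d(I, D) + |\Phi_I^t(x) - I|
     < \frac{\kappa\varepsilon}{2} + \frac{\kappa\varepsilon}{2} = \kappa\varepsilon .
\end{align*}
```

The last line shows that the orbit cannot leave $\mathcal{D}(\kappa\varepsilon)$ before time one, which is what makes the first two lines legitimate by a standard continuation argument.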